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1. Introduction 

Given two functions, we are interested in comparing the respective persis- 
tence diagrams. The persistence diagram is a set of points in the Cartesian 
plane used to describe births and deaths of homology groups as we iter- 
ate through sublevel sets of a function. There are many ways that we can 
compare two diagrams, including matching points between the diagrams. 
However, finding a meaningful matching (see Section [3]) is a difficult task, 
especially if we are interested in capturing the relationship between the un- 
derlying functions. 

As stated in the proposal, the goal of this Research Initiation Project is 
to gain an intuition for persistence and homology, as well as to understand 
the current state of research in these fields. I accomplished this goal by 
investigating the following problem: For a 2-manifold M, suppose we have 
two continuous functions /, g: M — > R. We can create the persistence dia- 
grams for / and for g. If we know the relationship between / and g, we can 
make informed decisions when matching points in the persistence diagrams. 
In particular, we are interested in the case where / and g are homotopic. 
Given a homotopy between / and g, we can create a vineyard of the per- 
sistence diagrams. Then, we use the vines in the vineyard to help make an 
informed matching of the points in the persistence diagrams. 

In this paper, we discuss two known methods of matching persistence 
diagrams, by measuring the bottleneck and the Wasserstein distances. Al- 
though stability results exist for matching persistence diagrams by mini- 
mizing either the bottleneck or the Wasserstein distance, these matchings 
are made without consideration of the underlying functions / and g. If we 
are able to create a continuous deformation of the function / to g, then we 
can use this additional information to aid in the matching of the points in 
the persistence diagrams. As a result, the matching obtained will be based 
on the underlying functions. We look into alternate way of measuring the 
distance between the persistence diagrams for the functions by assuming 
that there exists a homotopy between the functions. We create a homotopy 
that we call the heat equation homotopy and measure distances between the 
persistence diagrams for / and g by using these homotopies to aid in the 
pairing of points in the persistence diagrams of / and g. Then, we turn to 
analyzing an example of the heat equation homotopy and discuss various 
interesting patterns. 



2. Computational Topology Preliminaries 

Observing patterns and features in data sets is a common goal in many 
disciplines, including biology. Extracting the key features from a noisy data 
set can be an ambiguous task, and often involves simplifying and finding 
the best view of the data. Computational topology, and more specifically 
persistent homology, is a tool used for data analysis. Here, we give a brief 
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review of the necessary background of computational topology, but refer you 
to [10], [UJ and [13] for more details. 

2.1. Homology. Let X be a simplicial complex of dimension d. For p G N 
and p < d, the symbol X p will denote the power set of all p-simplices in 
X. Each set of X p is called a p- chain. The chain group C p is defined by 
the set under the disjoint union, or symmetric difference, operation. 
This operation can be interpreted as addition modulo two. The group C p is 
therefore isomorphic to Z2 to some non-negative integer power. All algebriac 
groups in this paper are vector spaces over Z2. Consider the boundary 
homomorphism : 

dp : Cp — > Cp_i, 

that maps the p-chain a G C p to the boundary of a, a chain in C p _i |13j . 
The p t/l homology group of X, denoted H p (X), is defined as the kernel of 
d p modulo the image of d p+ \: 

H P (X) = Ker(c>p)/Im(<9p + i). 

The kernel of the homomorphism d p is the set of elements in the domain 
that are evaluated to zero (the empty set) and the image of d p+ \ is the set 
of elements of the form 9 p +i(x), where x is in the domain Cp+i: 

Ker(dp) = {a G C p \d p (a) = 0} and 

Im(<9p+i) = {a G Cp|3a' G C p+1 3 d p+1 (a') = a}. 

The p th Betti number, ft p , is the rank of the p th homology group of X. By 
definition, the rank of a group is the (smallest) number of generators needed 
to define the group up to isomorphism. Since we are concerned with groups 
with Z2 coefficients, the rank uniquely defines the group up to isomorphism. 
For example, the group with three generators is Z| = Z2 © Z2 © Z2 and the 
group with n generators is Z?;. 

2.2. Persistent Homology. Now, we define persistent homology for func- 
tions from R to R. The complete discussion of the extension of these ideas 
to higher dimensions is found in [10J and [11] . We present a simplified set- 
ting to focus on the relevant concepts, while avoiding the complications that 
arise in the general setting. 

Suppose C is the graph of /: R — > R. We can think of / as the height 
function on C. Now, we characterize the topology of the sublevel set R{ = 
/ _1 ((— 00, s}), and we monitor how the homology groups change as s goes 
from negative infinity to infinity. The zeroth persistent homology group, 
denoted Hq(M.{), will change whenever s is a local maximum or a local 
minimum of the function /. The maxima and the minima are where the 
Betti numbers as well as the homology classes change for functions from R 
to R. A critical point is defined as the values r G R where the derivative 
is zero: ^f(f) = 0. Then, s = f(r) is called the critical value at r |18j . 
A Morse function defined over a subspace of R is a smooth function, such 
that no two critical values share a function value, and the second derivative 
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at each critical value is non-zero. For a Morse function /, the critical points 
are the set of r £ M with s = f(r), such that a Betti number changes by 
only one from R f s _ e to R f s+e for every sufficiently small value of e > 0. If the 
sum of the Betti numbers increases, we call r a positive critical point. If the 
sum decreases, then r is a negative critical point. 
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-\ 1 1 1 1 1 1 1 1 - n 1 1 1 1 1 1 1 1 ' 

012345678 012345678 

Figure 1. On the left, we see the graph of a function / 
in R 2 . On the right is the corresponding persistence dia- 
gram, Dgmo(/). Each point is drawn with multiplicity one. 
The birth at the point (7, 1) remains unpaired. 

As s increases from negative infinity, we label each new component with 
the positive critical value that introduces the component. We pair each 
negative critical value s with the most recently discovered unpaired positive 
critical value representing the components joined at s. This will lead to 
the diagram Dgmo(/) as demonstrated in Figure HJ Consider one pair: a 
positive critical value that was introduced at time s = s% and a negative 
critical value that was introduced at time s = S2, where s± < S2- This pair 
is represented in Dgmo(/) as the point (si, S2). The persistence of that pair 
is equal to the difference in function values: S2 — s±. In Figured! there are 
three pairs of points and one positive critical value that remains unmatched. 
This value represents an essential homology class. It would be paired if we 
were to consider the extended persistence diagram as presented in [6]. 

2.3. Relationships between Diagrams. We now turn to looking at two 
functions. Two functions are called homotopic if there exists a continuous 
deformation of the first function into the second. We are interested in creat- 
ing a homotopy between / and g in order to observe how the corresponding 
persistence diagram changes through time. More importantly, we use the 
homotopy when matching points in the persistence diagrams. Before pro- 
ceeding, let us formally define a homotopy and give an example that we will 
use later. 
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Definition 2.1 (Homotopy). We say that /, g: M — > R are homotopic if 
there exists a continuous function F : Mx [0, 1] — > R such that F(x, 0) = f(x) 
and F(x, 1) = <?(x), for all x E M. We will denote the homotopy F(x,t) 
by ft(x). 

Example 2.2 (The Straight Line Homotopy). The straight line homotopy 
interpolates linearly from each point in the continuous function / to the 
corresponding point in the continuous function g. We can write ft(x) = 
(1 - 1] )■ f(x) + t ■ g(x) for all t G [0, 1] and all x € M. 

For /, g : M — > R, we have a diagram for each integer p 6 {0, 1, 2}. Now, 
choose a value of p and assume that we have a homotopy (not necessarily the 
straight line homotopy) from g to /. Choose r time-steps of the homotopy, 
< ti < . . . < t T = 1. Let to = 0, so that Dgm p (/o) is the initial persis- 
tence diagram. At each time tj, we have a persistence diagram Dgm p (/^.). 
Moreover, given Dgm p (/j j ), we can compute Dgm p (/j j+1 ) in time linear in 
the number of simplices of the filtration by using a straight line homotopy 
between f tj and ft ]+1 , as described in [8]. In Section [572], we describe how 
to use information obtained from this computation in order to pair points 
in the persistence diagrams. Then, we stack the diagrams so that Dgm p (/i) 
is drawn at height t in R 3 and connect the points in the diagrams by curves 
formed by the line segments connecting matched points in consecutive dia- 
grams. The result is a piecewise linear path between points in Dgm p (/o) 
and points in Dgm p (/i). If we let the time difference between any two con- 
secutive diagrams approach zero, the piecewise linear path becomes a set 
of continuous curves by the stability result for the straight line homotopy 
(see Section [3. 4p . Each curve that traces the path of an off-diagonal points 
through time is called a vine. The collection of vines is referred to as a 
vineyard [8j [10]. We pair the endpoints of each vine to obtain a matching 
of the persistence points in Dgm p (g) with the points in Dgm p (/). 

3. Matching Persistence Diagrams 

To match the points in Dgm p (g) and Dgm p (/) without considering the 
homotopy, we look at methods of finding matchings on bipartite planar 
graphs. We begin this section with a few definitions. A matching is a 
bipartite graph P = (A U B, E) where the vertex sets A and B are disjoint, 
the edges are between one vertex in A and one vertex in B, and each vertex 
is incident on at most one edge. A matching is maximal if the addition of 
any edge would result in a graph that is no longer a matching. A matching 
is perfect if every vertex is incident upon exactly one edge. In other words, 
it is a matching where there does not exist an unmatched vertex. 

We consider two different methods for measuring the distance between the 
persistence diagrams, Dgm p (/) and T>gm p (g). The goal is to match every 
point in A =Dgm p (/) to a point in B =Dgm p (g) in order to minimize the 
cost of the matching. One method for determining this cost is to consider 
the bipartite matching problem, where for every a G A and b € B, the 
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edge {a, b} has a cost C(a, b). Then, the cost of a perfect matching is the sum 
(or the maximum) value of the edge costs. The cost for a matching that is 
not perfect is infinite. To resolve the issue where the number of off-diagonal 
points in both diagrams is not equal or the diagrams are dissimilar, we allow 
an off-diagonal point to be matched to a point on the line y = x. The use of 
the bottleneck and Wasserstein matchings for this purpose is presented in 
Chapter VIII of |10| . As we will show, each of these methods entails some 
notion of stability for the persistence diagrams. That is, we can bound the 
bottleneck and Wasserstein distance between persistence diagrams by the 
distance between / and g, for some class of functions. 

3.1. Two Cost Metrics. Consider the matching problem where we must 
match the elements of set ACM 2 with elements of a second set B CI 2 , 
and where there is a cost associated with each pair (a, b) £ Ax B. We will 
define both the bottleneck and the Wasserstein costs of a matching. Later, 
we will use these distance metrics as the costs to find the bottleneck and the 
Wasserstein matchings. Let C(a,b) be the distance between the points 
a and b; that is, C(a, b) = max{|a x — b x \, \a y — b y \}. 

Definition 3.1 (Bottleneck Cost). The bottleneck cost of a perfect match- 
ing P is the maximum edge cost: 

max C(a,b). 

(a,b)GP 

Definition 3.2 (Wasserstein Cost). The degree q Wasserstein Cost of a 
perfect matching is sum of the edge cost over all edges in the matching: 

f £ c ^ h ) q 

\(a,b)eP 

3.2. The Bottleneck Matching Criterion. The bottleneck matching min- 
imizes the bottleneck cost of a matching over all perfect matchings of A 
and B: 

Woc(A,B) = min max C(a,b). 

P (o,6)eG 

The use of the notation Woo to denote the bottleneck distance will become 
clear in the next section. 

If \A\ = \B\ = n, then a maximal matching can be found in 0(n 5 / 2 ) 
using the Hopcroft-Karp algorithm [15]. If we mimic the thresholding ap- 
proach of the hungarian method [16j, then the bottleneck solution can be 
found in 0(n 5 / 2 log n). Since A and B are sets of points in the plane, we 
can improve the computational complexity of determining the matching un- 
der the bottleneck distance. Efrat, Itai, and Katz developed a geomet- 
ric improvement to the Hopcroft-Karp algorithm with a running time of 
0(n x ' 5 log 2 n) [12] . 
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3.3. The Wasserstein Matching Criterion. In the Wasserstein Match- 
ing, we seek to minimize the maximum degree q Wasserstein cost over all 
perfect matchings: 

W q (A,B) = mm I ^ C(a,b) q 

\(a,b)&M 

Although for small values of q, the bottleneck and the Wasserstein criteria 
may produce different matchings, if we take the limit as q — > oo, we see that 
the Wasserstein criterion approaches the bottleneck criterion. 

The Hungarian method computes the Wasserstein matching in 0(n 4 ) 
computational complexity |16j . In [19] . Vaidya maintains weighted Voronoi 
diagrams for an 0(n 2 ' 5 log n) computation of this matching. Further im- 
provements were made by Agarwal, Efrat, and Sharir pQ. They utilize a 
data structure that improves the running time to 0(n 2+e ) for the min- weight 
Euclidean matching. 

3.4. Stability Theorems. A distance metric (and corresponding match- 
ing) is stable if a small change in the input sets A and B produces a small 
change in the measured distance between the sets. The property of stability 
is not obvious and sometimes not true. If a matching is stable, however, we 
can use it to create a vineyard from a smooth homotopy. 

Let Pqo be the matching obtained using the bottleneck criteria; that is, 
the matching is the matching that minimizes the bottleneck distance. 
Let /, g be tame functions. This means that the homology groups of M{ 
and of M§ have finite rank for all s. In addition, only a finite number of 
homology groups are realized as H p (M f s ) or H p (M 9 s ) $J\. The function 
difference ||/ — g\\oo is the maximum difference between the function values: 

11/ -fflloo = sup \f(x) -g(x)\. 

The Stability Theorem for Tame Functions, which gives us that the bottle- 
neck distance Woo(Dgm p (/), Dgm p (g)) is bounded above by ||/ — (/Hoc [5]. 
The proof of this theorem presented in [8] uses the following result: 

Stability Result of the Straight Line Homotopy. Given ft(x), the 
straight line homotopy from g to /, we know that there exists a perfect 
matching P of the persistence diagrams for / and g such that the bottle- 
neck cost of P is upper bounded by the distance between / and g: 

max C(a,b) < \\f - g\\oo- 

(a,b)£P 

We will explain how to obtain this matching P in Section 15.31 

The Wasserstein distance is stable for Lipschitz functions with bounded 
degree k total persistence. This is proven in [7\. If we relax either of these 
two conditions, then the Wasserstein distance becomes unstable for two 
functions / and g where [|/ — gjloo < e as e approaches zero. 
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avg(/) 
avg(^) : 



4. The Heat Equation 

The heat equation is a mathematical description of the dispersion of heat 
through a region in space. We start with the initial reading (or a guess) of 
the temperature throughout an enclosed space, say an empty room. After 
infinite time and given no external changes, the temperature at each point 
in the space will converge to the average initial temperature. 

4.1. Dispersing the Difference. Let the manifold M be a closed square 
subset of M 2 . Then, we can think of the functions /, g : M — > R as surfaces 
in M 3 . Now, let uo be the difference g — f ■ If Uq(x) = for all x € M, then 
we have / = g. Otherwise, define the average of a function as the integral 
divided by the area. 

f M f(x)dx 
area(M) 

and 

Im 9{x)dx 
area(M) 

Then, we can calculate the average value of uq over the domain M by sub- 
traction: 

avg(u ) = avg(g) - avg(/). 
We apply the heat equation to no and obtain u(x,t) where u(x,0) = uq(x) 
and limt->.oo u(x,t) = c. For sake of simplicity, we will assume that the 
average of Ko vanishes. If this is not the case, we can impose this condition 
by setting g = g - avg(u ). 

Now, we can observe the difference u disperse through time until u be- 
comes the zero function. Similarly, we know that f(x) +u(x,t) will go from 
g to /. Although the value of u(x, t) approaches zero for all x as t increases 

lim u(x, t) = 0, 

t— >oo 

we will stop at a time T when u(x,T) £ (— e,+e) for all x € M and for 
some e > 0. Then, the function / + u goes from g to a function close to /. 
Furthermore, the manner in which the function / + u changes is dictated by 
the heat equation. 

4.2. The Continuous Heat Equation. Here, we describe the heat equa- 
tion as it applies to a continuous function. For more details, please refer 
to [21 H] ■ In Section 14.31 we wm modify these equations to approximate the 
solution in the discrete case and we introduce the heat equation homotopy. 

Let {&i, 62} be an orthonormal basis for M. The general form of the heat 
equation satisfies the following conditions: 

/- n du , d 2 u , . d 2 u . . 

(1) _(,,*) -_ (M )__ (M ) = 

and the initial condition: 

(2) u(x, 0) = g{x) - f(x),x G M. 
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If we can solve this partial differential equation (PDE), we obtain u(x,t) 
defined for all x £ M and t > 0. Typically, the heat equation has the 
additional constraint that u(x, t) is constant with respect to t for all x E cM. 
However, we are interested in the case where u(x, t) is heat conserving. That 
is, we would like avg(u tl ) =avg(u t2 ) for all t\, £2- As we will show, in order 
to obtain this goal, the values on the boundary will reflect the values interior 
to the boundary. 

The equation in (pQ) describes how the temperature changes with respect 
to time and space. We note that if we impose the condition ^f(x, t) = 0, then 
the heat equation will not change with respect to time, and Equation ([T]) 
becomes Laplace's equation, Au(x, t) = 0. This is known as the steady-state 
heat equation and will have a unique solution. The iterative methods that 
we look at in Section T4.3I aim at finding an approximation of u(x, t) for this 
problem. The final solution will be constant with respect to time, and so 
we say it is approaching the steady-state. We are interested in following the 
behavior of heat equation as it approaches the solution to the steady-state 
heat equation. 

4.3. The Discrete Heat Equation Homotopy. Solving a partial differ- 
ential equation is not a simple task. Thus, we must defer to numerical 
methods to estimate this solution, which require spatial and temporal dis- 
cretization pQ. In the following computations, we use a regular grid decom- 
position of M = [0, l] 2 , writing Xj = (i — l)h and yj = (j — l)h, where 
h = l/(n — I) for some fixed integer n > 2. In the next section, we explain 
how to apply the heat equation on different topologies. 

The first step in creating the heat equation homotopy ft(x) is to compute 
ut(x) = u(x,t) over the discretized domain. There are three issues that can 
arise when using the continuous formulation of the heat equation described 
in Section 1431 

1. We need to solve the partial differential equation presented in Equa- 
tions ([I]) and ©. 

2. The partial derivative ^r{x) for b{ is not well defined over a discrete 



3. The solution u(x,t) is defined for all non-negative t, but a homotopy 
must to be defined for t G [0, 1]. 

In order to resolve these issues, we apply temporal and spatial discretization 
as well as scaling. The goal is to obtain a homotopy ft(x) from g to / using 
the heat equation solution u(x,t). Below, we describe one such resolution; 
note, however, that other approaches may be taken. 

4.3.1. Mathematical Description of the Heat Equation. Let us recall the 
steady-state heat equation over M 2 : 



domain. 



(3) 
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In this equation, we are using {61,62} as the standard basis vectors for M 2 . 
At each mesh point x = we employ the Taylor polynomial in the 

variable b\ to obtain an approximation of the second derivative with respect 
to b\: 

d 2 u u((i + l,j),t)-2u((i,j),t) + u((i-l,j),t) 

(4) = p , 

where h is the spatial step size. Similarly, we also have an approximation 
for the second derivative with respect to 62: 

d 2 u u((i,j + l),t) -2u((i,j),t) + u((i,j -l),t) 

(5) W <( hJ ),t) = - 2 . 

To simplify notation, we will now use Xjj to denote (i, j). If we plug © 
and ([5]) into ([3]), then we obtain the following equation: 

(6) 4u(xij,t) -u(x i+ltj ,t) -u(xi-i t j,t) -u(x itj+ i,t) -u(xij-x,t) =0 

Thus, the approximation to the heat equation is made by looking at local 
neighborhoods for each point Xij. Figure [2] highlights the four neighbors of 
the mesh needed to compute an approximation to the heat equation. The 




Figure 2. In the center, we have the white dot representing 
the mesh at position x = (i, j). The mesh points highlighted 
in pink are those whose values contribute to the estimate of 
the heat equation at x. 



neighborhood of a point, Nhd(i, j), is defined to be the set of local neighbors 
of the point Xij. 

(7) Nhd(i,j) = {(i,j±l),(i±l,j)} 

We have n 2 equations of the form presented in ([6]), one for each point in the 
n x n mesh. We relabel the mesh points in column-major order in order to 
use one index instead of two: V{(j-i)n+i} := u(xij,t). Then, we may express 
the n 2 linear equations in matrix-vector form, Av = 0. 
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4.3.2. An Iterative Algorithm for Linear Systems of Equations. As we have 
shown above, solving the discrete heat equation finds a solution v to the 
linear system of equations Av = 0. Above, we have described how to con- 
struct the matrix A, as it is the coefficient matrix for the system of linear 
equations. In the above description, A is equal to L n , the Poisson matrix of 
order n. We note here that this matrix is n 2 x n 2 . We can write L n = D — N, 
where D is the diagonal matrix 4 • / and A is a matrix with O's on the di- 
agonal and with only 1 as the non-zero entries of the matrix. Sometimes we 
refer to D as the valency matrix, since it expresses the degree of each mesh 
point. The matrix N is symmetric and we call it the neighborhood matrix 
since the non-zero entries in row % correspond to the neighbors of the mesh 
point Vi [3]. That is, N(i,j) = 1 iff V{ and Vj are adjacent. 

The iterative algorithm can be defined by these matrices. We want a 
solution of the form Av = 0, which means (D — N)v = 0. We can re-write 
this so that Dv = Nv. And, the iterative algorithm can immediately be 
seen: 

(8) V new = {D~ l N)v. 

In the original formulation, this translates to: 

u t +i{x) = | 

yeNhd(x) 

where the neighborhood Nhd(x) is the neighborhood of x defined by Equa- 
tion ([7]). This iterative method is known as Jacobi iteration. 

4.3.3. Creating the Homotopy. We continue the process of obtaining ut+i 
from ut until we have reached a halting point, where \ut+i(x) — ut(x)\ < e 
for all x E M and for some predetermined value of e. We will let T be 
the maximum time computed in the iterative method. Thus, we have a 
function u : M x [0, T] — > K. We reparameterize u(x, t) with respect to t in 
order to obtain u(x, t) = u(x, t ■ T) defined over the domain M x [0, 1]. Then, 
the heat equation homotopy ft can be defined by the equation: 

(9) f t (x) = f(x) + u(x,t). 

Notice that fo(x) = g(x) and f\{x) ~ f{x) with T sufficiently large since 
lim f\{x) = lim fix) + u(x,T) = f{x). 

T— too T^oo 

By using this homotopy, the initial difference between / and g disperses. 
Although u(x, t) approaches a constant function as t approaches T, inter- 
esting things can happen along the way. For example, critical values can be 
created. 

Example 4.1 (Mountain and Bridge). Suppose we have two cones con- 
nected by two pentagons as shown in Figure El We will call this surface S. 
Now, consider the height h : R 2 — > R defined to be the height of the surface 
at (x, y) below S and zero elsewhere. We assume that the base of the cones 
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Figure 3. The function used in Example 14, 11 



and the pentagons is in the xy-plane. There is one dimension one critical 
value, the pinnacle of the pentagons. If we apply the heat equation to this 
surface, there will soon be at least three dimension zero critical values: one 
corresponding to the original critical value and two from the cone tips. 

4.4. Considering Different Topologies. Here, we consider implementing 
different topologies for the square domain. Each mesh element has four 
neighbors, corresponding to above, below, left, and right. The topologies 
are determined by which vertices we define to be neighbors. We consider 
four topologies: the square, the torus, the Klein bottle, and the sphere. The 
square topology, used when describing the iterative heat equation, is the 
most basic. 

Definition 4.2 (Square Topology). The neighborhood of is: 

Nhd(i,j) = {(i,j± l),(i±l,j)}, 

provided that these elements are within the domain. 

Each vertex can have two, three, or four neighbors, resulting in a sys- 
tem of equations that does not conserve heat. Let V be the set of vertices 
in M, and define the total heat to be the sum of the heat over the entire 
domain: YlxeV u ( x )- Then, the total heat is not preserved between itera- 
tions. To fix this, let x be its own neighbor for every neighbor that x is 
missing. Hence, each x G V is the neighbor of four other vertices and has 
four neighbors itself. For the second step, we have 

£«(<M + i) = £^^ = E«0M). 

x^v xev x&v 

The four edges of the square create the boundary on the square topology. 
If we identify the boundary edges in pairs, we can then create a surface 
without boundary. One way to do this is to create a torus from the square. 
Formally, we define the torus topology as follows: 
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Definition 4.3 (Torus Topology). Given annxn mesh, the neighborhood 
of is: 

Nhd(i,j) = {(i,j± n l),(i ± n l,j)}, 
where addition and subtraction are calculated modulo n. 

The torus topology can also be described by Figure [U By gluing the 




Figure 4. The torus is the quotient of the unit square by 
gluing together opposite sides as prescribed by their orienta- 
tion. 



a if 




a b 




(a) The Klein bottle 



(b) The sphere 



Figure 5. The figures above illustrate how the boundaries 
of the square are glued to obtain the Klein bottle and the 
sphere. 



two a edges together and the two b edges together, we obtain a manifold 
without boundary. In this manifold, each vertex v is the neighbor of four 
other vertices, and has four neighbors that contribute to the new value for 
that vertex. Similarly, we can define the Klein Bottle and the Spherical 
topologies as depicted in Figure 03 
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4.5. Convergence. When using iterative algorithms, we worry about how 
long finding the solution (or something close to the solution) will take. In 
this section, we discuss one metric that measures the convergence of iterative 
algorithms. 

If = u(x, ifc) is the solution at stage k, then the solution at stage k + 1 

is 

x k+i = }_ Nx (k)_ 
4 

We now see that 

(10) Dx( k+1 > = NxW 
and for the solution x°°: 

(11) Dx°° = Nx°°. 
If we subtract (fTQ|) from (fTTj) . then we obtain 

DE k+l = NEk, 

where E\. = x°° — x k is the error of the k th estimate x^ . Hence, 

E k+1 = D- l NE k = {D- x N) k + x Eo. 

The action of M = D~ l N on the initial error determines whether the er- 
ror of the solution x will increase or decrease. Now, if M has n 2 non- 
degenerate eigenvalues and n 2 linearly independent eigenvectors, we can 
write M = VEV , where £ is the diagonal matrix of eigenvalues. Since 
M k = VYj V~ x , we use the spectral norm of the matrix M to determine if 
the error is compounding or decreasing. The spectral norm, p, of a matrix 
the largest absolute eigenvalue of that matrix. If p < 1, then the iterative al- 
gorithm converges. If p > 1, then the iterative algorithm does not converge. 
For example, if we are using Jacobi iteration as described in Section 14.3.21 
then p is very close to one, and thus convergence is very slow p3]. Although 
this is undesirable behavior for finding a solution to the heat equation, for 
our purposes, it allows us to more closely examine the behavior of the heat 
equation. 

5. Vineyards of the Heat Equation Homotopy 

In Section 12.31 we defined homotopy and stated that we can stack the 
persistence diagrams associated with the homotopy to create a vineyard. 
In this section, we estimate the underlying vineyard for the heat equation 
homotopy by monitoring the transpositions in the filter. 

5.1. Turning a Mesh into a Filter. We begin with an n x n matrix of 
values. Although the heat equation computes neighbors based on a grid, we 
will be computing persistence using simplicial complexes. Thus, we trian- 
gulate the domain by adding in horizontal, vertical and diagonal edges, as 
well as the triangles formed by the voids. 
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Figure 6. The triangulation of a 4 x 4 grid. 

An n x n mesh gives us n 2 vertices, (n — l)(3n — 1) edges, and 2(n — l) 2 
faces. In Figure [61 we see the triangulation of a four-by-four mesh. Each 
simplex a is defined by the vertices that create it. In addition, the value of 
a simplex, f(cr), is the maximum function value of those vertices. Now, we 
order the simplices by the following two rules: 

1. If /(el) < /((T2). then o"i appears before simplex cr 2 . 

2. If r is a subsimplex of a, denoted r < u, then simplex r appears 
before simplex a. 

We observe here that the second rule does not contradict the first, because 
f( T ) < /(°0 whenever r is a face of a. The resulting ordering of the simplices 
is called a filter. The two rules imply that every initial subsequence of the 
filter defines a subcomplex of the mesh. Growing this initial subsequence 
until it equals the entire filter gives a sequence of simplicial complexes called 
the induced filtration. As with any sorting algorithm, to create a filter of m 
simplices will take 0(m log m) time. 

5.2. Computing Vines. Recall the heat equation homotopy from Equa- 
tion ©, 

ft = f(x) + u(x,t), 

which is defined over a finite number of time-steps: 0, t±,t2, ■ ■ ■ , ir = 1- 
Create a filtration for fo(x) as prescribed above. We use the filtration to 
compute the persistence diagrams, just as we used the sublevel sets in Sec- 
tion 12.21 We progress from one complex in the filtration to another by 
adding one simplex. The addition of this simplex can be the birth of a new 
homology class or the death of an existing homology class. After we have 
iterated through the entire filter, we have completed the computation of the 
persistence diagrams for time 0. 

To compute Dgm p (/ tl ), we could repeat the same process. However, if 
we use Dgm p (/( ), we can compute the new diagram in linear time and we 
can match points in the two diagrams [8]. 

5.2.1. Transpositions. Suppose we have two filters on m simplices, such that 
the filters are identical except that two adjacent simplices have swapped 
order. Then, we can write the first filter: 

Fl : (Ji, <7 2 , . . . , (Tj, CTj + i, . . . , cr m 



15 



and the second filter: 



F2 : 01,0-2, • • • • • • 



If we have the persistence diagram for the filter Fl, then we can com- 
pute the persistence diagram for the the filter F2 by performing one trans- 
position. A transposition updates the persistence diagram by changing 
the pairings of two consecutive simplices, if necessary. In the case that 
dim(<7j) ^ dim(<7j + i), the transposition does not affect the pairs in the per- 
sistence diagram. Thus, only the transposition of simplices with the same 
number of vertices can result in a pairing swap. One transposition can swap 
the births and deaths of at most two points in the persistence diagram. 



Figure 7. The filter of a simplicial complex is an ordering 
on the vertices, edges, and faces. The simplices are originally 
ordered o\ through a-j. Swapping o\ and a-i results in a 
pairing swap of type 1. Swapping 04 and 05 results in a 
pairing swap of type 2. Swapping o~ 5 and 0% results in a 
pairing swap of type 3. 

We must distinguish between nested and unnested persistence pairings. 
We say that two persistence points, (61, d%) and (&2,efo), are nested if the 
birth at 62 and the death at d<i occur after 61 and before d2- Assuming all 
events happen at distinct moments of time, this is equivalent to b± < 62 < 
d-i < d\. If we transpose <7j and crj+i, then the three types of pair-swapping 
transpositions are: 

1. The births of two nested pairs are transposed. In this case, o~i is 
associated with b\ in Dgm p (i ? l), and with 62 in Dgm p (i ? 2). 

2. The deaths of two nested pairs are transposed. In this case, o~i is 
associated with cfo in Dgm p (i ? l), and with d\ in Dgm p (i ? 2). 

3. The birth of one pair and the death of another, unnested, pair are 
transposed. In this case, the addition of o"j to the filtration created 
a death in Dgm p (i ? l), but a birth in Dgm p (i ? 2). 

Figure [7] illustrates the three types of transpositions that can be made. 
Although pair swaps of types 1 and 2 involve two persistence points in the 
same diagram, pair swaps of type 3 involve persistence points in diagrams of 
two consecutive dimensions. In each of the cases, the transposition results in 
a swap only if changing the order in which the simplices are added changes 
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the persistence pairing that is made. For a complete algorithm to compute 
the transpositions, please refer to [8]. 

From these transpositions, we can create the matching referred to in Sta- 
bility Result of the Straight Line Homotopy of I3.4L Suppose the transposi- 
tion of Oi and of Cj+i resulted in a pairing swap. Then, every persistence 
point is paired with itself in the matching, except for (&i, d\) and (62,^2), 
whose pairings are swapped. 

5.2.2. Sweep Algorithm. Changing one filtration into another may require 
more than one transposition. In order to compute the persistence diagram, 
we first create a topological arrangement and then use a sweep algorithm. 
In the Cartesian plane, write the filtration of ft horizontally. Below that, 




Figure 8. We create a topological arrangement by con- 
necting drawing a line between the vertices that represent 
the same simplex. By doing this, we find a finite number of 
crossings. Each crossing represents one transposition in the 
filter. 



write the filtration of ft +1 using the same simplex names. Then, we connect 
like simplices with a single curve (does not need to be straight). An example 
of this process is given in Figure After this arrangement has been created, 
an ordering on the transpositions can be found by topologically sweeping the 
arrangement, as presented in [9]. 

We compute Dgm p (/ ij+1 ) and keep track of the matching by progressing 
one transposition at a time in the order dictated by the sweep algorithm. 

5.3. Measuring Distance between Persistence Diagrams. Now, we 
assume that there exists a homotopy between functions / and g. Then, we 
have a vineyard that connects each point a in A =Dgm p (/) with a point 
b in B =Dgm p (<7). We will use this pairing as our matching and define a 
distance metric for it. 

Assume a and b are connected by the vine s : [0, 1] — > M. 3 , as described 
in Section 15.21 Since the vine was created from a homotopy, we will use the 
index t to emphasize that s(t) is a persistence point for the function ft(x). 
The velocity of the vine ^ will be integrated on [0, 1] in order to measure 
the distance traveled between a and b : 
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In order to obtain a distance between persistence diagrams we sum these 
distances over all vines in the vineyard V: 

(12) D fg = J2D s . 

s&V 

If s(t) is only defined for a discrete set of times tj with < i < T, then we 
obtain an alternate definition for Df g : 

(13) D f9 = Y, E ll*(*i)-*(*i-l)lloo. 

sGV ie(0,T] 

When a point a € ^4 enters the diagonal at t < 1, we pair a with the 
corresponding diagonal point, since the diagonal points can only occur as 
endpoints of a vine by definition of vine in Section I5.2L Then, we only 
measure the distance over the interval in which the vine is defined, [0,t). 
Symmetrically, we can have a diagonal point in A paired with an off-diagonal 
point in B. 

We are interested in understanding how this distance metric compares 
with the bottleneck and Wasserstein distances, as well as to understand the 
properties of this distance metric and homotopy matching. 

6. An Example 

In this section, we present the results from one example. Although the 
results from one example cannot be generalized, we are able to capture 
the different behaviors of the homotopy under the four topologies (square, 
sphere, torus, and Klein bottle). In the subsequent examples, M is a square 
closed region of M 2 . The functions / and g are approximated by a 101 x 101 
mesh. We obtain the function values from two gray-scale images of a bird 
and a flower respectively, as shown in Figure [9j The difference uq = g — f is 
computed. From here, we will find a homotopy from uq to the zero function. 
We note that this homotopy differs from the one previously defined, since 
we do not add / to the heat equation solution. In practice, we found that 
this homotopy more clearly displays the behavior of the heat equation. 

In Figures [T01I144 we see several stages of the heat equation homotopy 
using various topologies of the square. The persistence diagrams shown are 
combined diagrams for all dimensions. In Figure [T5j we look at the diagram 
for the first step of the homotopy, separated for dimensions p = and p = 1, 
under the Klein bottle and sphere topologies. We note that the points with 
the highest persistence appear in the dimension one persistence diagram. 
This is a property that remains true as the homotopy progresses. 

6.1. Analyzing Different Topologies. In the topologies without a bound- 
ary (torus, Klein bottle, and sphere), a new feature is created with a rela- 
tively high persistence. For example, if we start with a triangular region of 
high values against a border, as in Figure \16\ then there exists a 1-cycle in 
the sublevel sets. Note, however, that there is not a 1— cycle in any sublevel 
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(a) Bird (b) Flower (c) Difference 

Figure 9. The homotopy acts on the difference between the 
image of the bird and the image of the flower. The grayscale 
values of the image of the difference represent the values of 
the height function uq. In the images, the dark pixels cor- 
respond to the low values and the light pixels correspond to 
the high values. 




-200 -100 100 200 300 

Birth 



Figure 10. The persistence diagram of the difference func- 
tion. This is the diagram at Step of the heat equation un- 
der the square topology. The possible values are the integers 
in [—255,255]. The blue star drawn at height 300 represents 
the essential homology class. 

set of the square topology. Since we have created the four different topolo- 
gies by gluing the edges of the square together in various ways, different 
behaviors along these edges can be expected. We keep this difference in 
mind as we continue to look for commonalities and other differences caused 
by the adopted topology. 

The persistence diagrams for the torus and the Klein bottle topologies 
behave similarly. In most of the graphs in this section, the curves of the 
torus and the Klein bottle are usually parallel. This is an indication that 
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(a) Step 1 



(b) Step 10 



(c) Step 100 



Figure 1 1 . Persistence diagrams for the heat equation using 
square topology. 



200 300 



(a) Step 1 



(b) Step 10 



(c) Step 100 



Figure 12. Persistence diagrams for the heat equation using 
torus topology. 



(a) Step 1 



(b) Step 10 



(c) Step 100 



Figure 13. Persistence diagrams for the heat equation using 
Klein bottle topology. 



orientability has little effect on the heat equation homotopy. This does not 
come as a not a surprise, as we did not use the orientation when computing 
the heat equation. 

6.2. Duration of Vines. In Figure [17J we see the distribution of the vine 
lengths in the vineyard Dgmo (itt). The histogram is skewed right, since the 
mean is greater than the median. Table [T] confirms this observation. The 
same pattern is in fact observed under different topologies. When comparing 
the vine lengths of any two topologies, the Kolmogorov-Smirnov test for 
statistical difference with a = .05 fails to reject the null hypothesis that 
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(a) Step 1 



(b) Step 10 



(c) Step 100 



Figure 14. Persistence diagrams for the heat equation using 
spherical topology. 



(a) Klein Bottle, Dimension 



(b) Klein Bottle, Dimension 1 



(c) Sphere, Dimension 



-200 -100 



100 200 



(d) Sphere, Dimension 1 



Figure 15. In these figures, we separate the persistence dia- 
grams from step one of the homotopy using the Klein bottle 
and the sphere topologies. The diagrams for combined di- 
mensions are Figure [T3lfa) and Figure [T4Ta). 



the distributions are the same. On the other hand, if we remove the short- 
lived vines, then we start to see that comparing the Klein bottle and sphere 
topologies results in the rejection of the null hypothesis. However, this is 
not a strong enough indication that these distributions are different. The 
details of this statistical method are out of the scope of this paper, but can 
be found in [T7] , 
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Figure 16. A square with a high- valued (light-colored) re- 
gion on the boundary. 




Vine Length 

Figure 17. The distribution of the length of vines in the 
vineyard for Dgmn(iit) under the square topology. 

From these observations we can conclude that all topologies display a 
similar distribution of the length of vines, with many of the vines being 
short-lived. In addition, under the torus and Klein bottle topologies, a large 
number of vines span the entire vineyard. 

6.3. Monitoring Total Persistence. The degree q total persistence is 
the sum of the q th powers of persistence over all points in the persistence 
diagram. In Figure [T8l we have the graph of the degree one total persistence 

Table 1. Vine Length Statistics for Dgnin(ut) 





mean 


median 


mode 


s.d. 


square 


63.2 


15 


6 


132.8 


sphere 


69.4 


25 


6 


143.8 


torus 


79.8 


26 


500 


105.1 


Klein 


72.5 


21 


500 


101.4 
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Figure 18. Total persistence of degree one. 



o. 4-1 



Square 
— Sphere 
— Torus 
---Klein Bottle 



Square 
— Torus 
— Klein 



(a) Degree 2 



(b) Degree 3 



Figure 19. The declining total persistence of degrees two 
and three. In the graph for the total persistence of degree 
three, we omit the sphere topology. Relative to the other 
topologies, the values were very high. 



versus the iteration. Although the total persistence rapidly deteriorates 
initially, the decay slows down around iteration 100. - - We notice here 



Table 2. Vine Length Statistics for Dgmi(tit) 





mean 


median 


mode 


s.d. 


square 


119.6 


65 


6 


97.7 


sphere 


122.5 


56 


6 


94.2 


torus 


122.5 


70 


6 


137.1 


Klein 


126.2 


58 


6 


151.3 
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that the Klein bottle and the torus have an end behavior different than that 
of the sphere and the square, in that we do not see the total persistence 
approaching zero after 500 steps of the heat equation. - In these figures, the 
sphere behaves radically different than the other three topologies. 




100 200 300 400 500 

Iteration 

Figure 20. The mean absolute change between steps of the 
heat equation homotopy. 

6.4. Mean Absolute Change. Figure [20] shows that the mean absolute 
value of the change between steps of the homotopy decreases rapidly at 
first, then slowly. Recall that the values of the mesh points range from 
—255 to 255. At the first step, the mean absolute change is between 10 and 
15 (not shown in Figure [20jl . which is only a 2% — 3% initial change. The 
value, however, remains above 0.1 in all cases except for the square topology. 
Given the nature of the heat equation, we expect the values to decrease with 
respect to time. The small values for mean absolute change are in part due 
to the initial values of tin- The values of one vertex does not differ by a 
large amount from the values of its neighbors. We know that the iterative 
method chosen is slow to converge; however, this graph allows us to gauge 
how little change is occurring at each step of the homotopy. 

6.5. Counting Transpositions. The total number transpositions between 
steps of the heat equation versus time is shown in Figure l2"Tlfa), All four 
topologies follow a decreasing pattern that levels off, with the Klein bottle 
and the torus behaving distinctively different from the square and the sphere. 

We remark on several notable observations from these diagrams. The total 
number of transpositions is still significant when the heat equation algorithm 
reaches the halting condition. At the last step, the square topology makes 
over 400,000 total transpositions. The other three topologies make even 
more transpositions. 



2-1 




(a) Total (b) Type 1 Pair Swap 



(c) Type 2 Pair Swap 



Square 
- - Sphere 

---Klein 



(d) Type 3 Pair Swap 



Figure 21. The number of transpositions between steps of 
the heat equation homotopy. The top left shows the total 
number of transpositions and the remaining three figures are 
restricted to the three types of transpositions that result in 
pairing swaps. 



In Figure [211 (b),(c), and (d) to see a different pattern for the number of 
the switches resulting from the transpositions. Under the square topology, 
the number of pair swaps of types 1, 2, and 3 is down to the double digits 
after 500 iterations of the heat equation. We notice that Figures (b) and 
(c) are almost identical. This is not a surprising observation, since pairing 
swaps 1 and 2 (described in Section T5.2.ip are symmetric cases of two births 
of the same dimension or two deaths of the same dimension being transposed, 
resulting in a pair swap in the diagram. 

Three other patterns are worth noting in the graphs of Figure EJJ First, we 
see that the number of transpositions (of all kinds) rapidly decreases until 
iteration 25. At this point, the four topologies show different behaviors. 
Second, after iteration 200, the torus and the Klein bottle topologies have 
both leveled off to a constant function. Finally, we notice that adding graphs 
(b), (c), and (d) does not result in a graph that looks like (a). Thus, a 
significant number of the transpositions made are those that do not result 
in a pairing swap. Proportionally, this occurs more often in the torus and 
Klein bottle topologies than in the square and the sphere topologies. 



25 




0.5 



100 200 300 400 

Iteration 

Figure 22. The number of transpositions between vertices 
versus the iteration of the heat equation. 



In Figure [22| we restrict our counts to the number transpositions of ver- 
tices only. The values at the vertices dictates the values of the edges and 
the faces. The pattern that this graph follows is similar to Figure [2TTa). 
the graph for the number of transpositions generated from all of the sim- 
plices Thus, is seems that the behavior of the vertex-vertex transpositions 
is proportional to the behavior of simplex-simplex transpositions. 



7. Conclusion 

The objective of this project was to find a new way of measuring the 
distance between two functions. We develop a homotopy of functions based 
on the heat equation and investigated the behavior of the vineyard resulting 
from this homotopy. We computed this homotopy using four topologies, 
and found some trends that differed from our expectation. For example, 
the heat equation on elliptic surfaces is known to converge more quickly 
than in Euclidean space. Yet, we found the behavior of the sphere topology 
to contradict this fact. We believe that this contradiction arises from the 
discretization of the sphere. 

We began an investigation of the behavior of the heat equation homotopy; 
however, many directions still remain for where we can continue. We would 
like to explore if the observations made in Section [6] were specific to initial 
function no, or if we are observing behaviors that arise from the topologies 
used in the heat equation. Moreover, we would like to formally compare the 
matching we obtain with the bottleneck and Wasserstein matchings. 
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